EQUIVIO at TREC 2009 Legal Interactive
Abstract
Equivio participated in two runs in the Legal Interactive track, on topics 205 and 207. Both runs used the Equivio>Relevance product, an expert-guided system for automated prioritization of documents and keywords. Based on initial input from a lead attorney, Equivio>Relevance uses statistical and self-learning techniques to calculate a graduated relevance score for each document in the data collection. Equivio>Relevance works as follows:
• The software generates a sample of documents and presents it to an expert, that is, an attorney who is familiar with the case.
• The expert tags each sample document as relevant or not relevant.
• The tagged samples are used to train the software. Based on the expert's decisions on the sample documents, the software learns how to score documents for relevance.
• In an iterative, self-correcting process, Equivio>Relevance feeds additional samples to the expert. These statistically generated samples allow Equivio>Relevance to progressively improve the accuracy of its relevance scoring.
• Once the sampling process has been optimized, the software indicates that it has reached "stable" mode, meaning that additional samples are not required. At this point, batch scoring can be invoked, in which the software calculates a graduated relevance score for each document.
Each sample comprises 40 documents. Typically, 25 to 50 iterations are required for the software to stabilize and optimize the sampling.
Equivio>Relevance comprises three key modules: an interactive model used by the expert for tagging documents, a statistical model that monitors and manages the iterative sampling process, and a text classification algorithm. The statistical model manages sample selection; the samples are fed to the expert, and the expert's input on the samples is used to train the classifier.
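The iterative sample, tag, and train loop described in the abstract can be illustrated with a minimal Python sketch. This is not Equivio's implementation: the abstract does not specify the classifier, the sample-selection statistics, or the "stable" criterion. The sketch assumes a small naive Bayes scorer, uncertainty sampling (documents scored closest to 0.5 are sampled first), and a stopping rule based on how little the scores change between rounds. The names `NaiveBayesScorer`, `make_corpus`, `review_loop`, `tol`, and `patience`, as well as the synthetic corpus and oracle expert, are hypothetical. Only the batch size of 40 comes from the abstract; at that size, 25 to 50 iterations correspond to 1,000 to 2,000 documents tagged by the expert.

```python
import math
import random
from collections import Counter

BATCH_SIZE = 40  # from the abstract: each sample comprises 40 documents
HOT = ["merger", "contract", "audit", "liability"]  # hypothetical relevant terms
COLD = ["lunch", "holiday", "parking", "newsletter", "football", "weather"]


class NaiveBayesScorer:
    """Tiny multinomial naive Bayes, standing in for the unspecified classifier."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.token_totals = {True: 0, False: 0}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    def train(self, text, relevant):
        words = text.split()
        self.word_counts[relevant].update(words)
        self.token_totals[relevant] += len(words)
        self.doc_counts[relevant] += 1
        self.vocab.update(words)

    def score(self, text):
        """Return a graduated relevance score in [0, 1] (Laplace-smoothed)."""
        v = len(self.vocab) + 1
        log_odds = math.log((self.doc_counts[True] + 1) / (self.doc_counts[False] + 1))
        for w in text.split():
            p_rel = (self.word_counts[True][w] + 1) / (self.token_totals[True] + v)
            p_non = (self.word_counts[False][w] + 1) / (self.token_totals[False] + v)
            log_odds += math.log(p_rel / p_non)
        log_odds = max(-30.0, min(30.0, log_odds))  # keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-log_odds))


def make_corpus(n=600, seed=1):
    """Build synthetic documents plus an oracle 'expert' (both hypothetical)."""
    rng = random.Random(seed)
    docs = []
    for _ in range(n):
        if rng.random() < 0.2:  # roughly 20% of documents are relevant
            words = [rng.choice(HOT)] + [rng.choice(HOT + COLD) for _ in range(5)]
        else:
            words = [rng.choice(COLD) for _ in range(6)]
        docs.append(" ".join(words))

    def expert(text):
        # The simulated attorney tags a document relevant if it has any HOT term.
        return any(w in HOT for w in text.split())

    return docs, expert


def review_loop(docs, expert, max_rounds=50, tol=0.02, patience=3, seed=0):
    """Sample, tag, and train until scores stop moving; then return all scores."""
    rng = random.Random(seed)
    scorer = NaiveBayesScorer()
    unlabeled = list(range(len(docs)))
    rng.shuffle(unlabeled)  # the first sample is effectively random
    prev = [0.5] * len(docs)
    rounds = calm_rounds = 0
    while unlabeled and rounds < max_rounds and calm_rounds < patience:
        # Assumed selection rule: most uncertain documents first.
        unlabeled.sort(key=lambda i: abs(prev[i] - 0.5))
        batch, unlabeled = unlabeled[:BATCH_SIZE], unlabeled[BATCH_SIZE:]
        for i in batch:
            scorer.train(docs[i], expert(docs[i]))
        scores = [scorer.score(d) for d in docs]  # batch scoring of the collection
        drift = sum(abs(a - b) for a, b in zip(scores, prev)) / len(docs)
        calm_rounds = calm_rounds + 1 if drift < tol else 0  # assumed "stable" test
        prev = scores
        rounds += 1
    return prev, rounds


if __name__ == "__main__":
    corpus, oracle = make_corpus()
    final_scores, n_rounds = review_loop(corpus, oracle)
    print(f"rounds: {n_rounds}, documents tagged: {n_rounds * BATCH_SIZE}")
```

In this sketch the expert's effort is bounded by `max_rounds * BATCH_SIZE` documents, while every document in the collection still receives a score. A real system would likely use a held-out control sample rather than score drift to decide when it is stable.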
Similar resources
Interactive Task Guidelines TREC 2009 Legal Track
In 2009, the TREC Legal Track will again be featuring an Interactive Task (along with a Batch Task, for more on which see the Legal Track website [1]). This document contains guidelines for this year’s Interactive Task, with particular focus on aspects of task design that are new in 2009; the document also covers specific steps those interested should take to register for and begin the task.
TREC 2009 at the University of Buffalo: Interactive Legal E-Discovery With Enron Emails
For TREC 2009, the team from the University at Buffalo, the State University of New York, participated in the Legal E-Discovery track, working on the interactive search task. We explored indexing and searching at both the record level and the document level with the Enron email collection. We studied the usefulness of fielded search and document presentation features such as clustering documents...
A Model for Understanding Collaborative Information Behavior in E-Discovery
The University of Pittsburgh team participated in the interactive task of the Legal Track in TREC 2009. We designed an experiment to investigate the collaborative information behavior (CIB) of a group of people working on the e-discovery tasks provided by the Legal Track in TREC 2009. Based on these studies, we proposed a model for understanding CIB in e-discovery.
Clearwell Systems at TREC 2009 Legal Interactive
The TREC Legal Track 2009 features an Interactive Task that is designed to replicate real-world challenges in producing a collection of responsive documents from a large collection of documents. The task required us to produce responsive documents from any of the seven topics, which are production requests. Clearwell Systems incorporated novel methods for producing a responsive collection using...
Overview of the TREC 2010 Legal Track (Notebook Draft 2010.10.25)
The TREC 2010 Legal Track consisted of two distinct tasks: the learning task, in which participants were required to estimate the probability of relevance for each document, and the interactive task, in which participants were required to identify all relevant documents using a human-in-the-loop process. 2010 is the fifth year of the legal track, the third year of the interactive task within the ...